FlexiBLAS - A flexible BLAS library with runtime exchangeable backends

Authors

  • Martin Köhler
  • Jens Saak
Abstract

The BLAS library is one of the central libraries for the implementation of numerical algorithms. It serves as the basis for many other numerical libraries such as LAPACK, PLASMA, or MAGMA (to mention only the most obvious). A fast BLAS implementation is therefore the key ingredient for efficient applications in this area. However, for debugging or benchmarking purposes it is often necessary to replace the underlying BLAS implementation of an application, e.g. to disable threading or to include debugging symbols. In this paper we present a novel framework that allows one to exchange the BLAS implementation at run-time via an environment variable. Our concept requires neither relinking nor recompilation of the application. Numerical experiments show that this new approach introduces no notable overhead. At very little additional overhead, the framework naturally extends to a minimal profiling setup that counts the calls to the BLAS routines used and measures the time spent in them.
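
The idea described in the abstract, a thin wrapper that forwards each call to a backend chosen through an environment variable and can count and time the calls, can be illustrated with a small conceptual sketch. This is not FlexiBLAS's implementation, which works with real BLAS shared libraries. The variable name `DEMO_BLAS_BACKEND`, the two toy backends, and all function names below are hypothetical and chosen only for illustration. For simplicity the sketch reads the variable on every call.

```python
import math
import os
import time
from collections import defaultdict

# Two interchangeable toy "backends" that provide the same routine.
# Both are hypothetical stand-ins for real BLAS implementations.
def _ddot_naive(x, y):
    return sum(a * b for a, b in zip(x, y))

def _ddot_fsum(x, y):
    return math.fsum(a * b for a, b in zip(x, y))

BACKENDS = {
    "naive": {"ddot": _ddot_naive},
    "fsum": {"ddot": _ddot_fsum},
}

ENV_VAR = "DEMO_BLAS_BACKEND"   # hypothetical variable name
DEFAULT_BACKEND = "naive"

# Minimal profiling state: number of calls and accumulated time per routine.
call_counts = defaultdict(int)
call_seconds = defaultdict(float)

def active_backend():
    """Return the backend named in the environment, or the default."""
    name = os.environ.get(ENV_VAR, DEFAULT_BACKEND)
    return name if name in BACKENDS else DEFAULT_BACKEND

def ddot(x, y):
    """Wrapper: forward to the selected backend and record call statistics."""
    impl = BACKENDS[active_backend()]["ddot"]
    start = time.perf_counter()
    try:
        return impl(x, y)
    finally:
        call_counts["ddot"] += 1
        call_seconds["ddot"] += time.perf_counter() - start
```

Under these assumptions, application code only ever calls `ddot`, so setting `DEMO_BLAS_BACKEND=fsum` in the environment before starting the program switches the backend without touching that code.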

Similar resources

A High Performance FPGA-Based Accelerator for BLAS Library Implementation

This paper describes the implementation and the performance analysis of a hardware accelerator for the BLAS library matrix multiplication operation. This accelerator is based on a dual-FPGA board and on an implementation BLAS software library making use of the FPGA-based hardware. In order to evaluate the performance of such a system, we implemented the matrix multiplication operation (BLAS “dg...

Exascale Ready Work-Optimal Matrix Inversion

In this thesis I present a new algorithm, OPT, for matrix inversion that builds on a matrix multiplication subroutine. It combines Strassen’s matrix inversion algorithm with Newton approximation. OPT overcomes the linear lower bound on the parallel runtime of Strassen’s inversion algorithm and of traditional Gaussian elimination, without the additional log factor of work incurred by Newton approximation. In particula...

A compiler toolkit for array-based languages targeting CPU/GPU hybrid systems

This paper presents a compiler toolkit that addresses two important emerging challenges: (1) effectively compiling dynamic array-based languages such as MATLAB, Python and R; and (2) effectively utilizing a wide range of rapidly evolving hybrid CPU/GPU architectures. The toolkit provides: a high-level IR specifically designed to express a wide range of array-based computations and indexing modes...

jReality - interactive audiovisual applications across virtual environments

jReality is a Java scene graph library for creating real-time interactive applications with 3D computer graphics and spatialized audio. Applications written for jReality will run unchanged on software and hardware platforms ranging from desktop machines with a single screen and stereo speakers to immersive virtual environments with motion tracking, multiple screens with 3D stereo projection, an...

Level 2 and 3 BLAS in the NAG C Library

This report describes a set of matrix-vector routines (Level 2 BLAS) and matrix-matrix routines (Level 3 BLAS) written in C. These routines have been included in Mark of the NAG C Library and are used by other library routines in that library. Details are given of the implementation, testing and use of the routines, and a complete listing of all the ANSI C function prototypes is included in the Appendix. Th...

Publication date: 2013